45 research outputs found

A probabilistic model for gene content evolution with duplication, loss, and horizontal transfer

Author: A.B. Simonson
B. Boussau
B. Snel
B.E. Dutilh
B.G. Mirkin
C. Pál
C.G. Kurland
D.H. Huson
E. Belda
E.A. Herniou
E.D. Green
E.J. Deeds
E.L.L. Sonnhammer
E.V. Koonin
F. Delsuc
F. Tekaia
G.D.P. Clarke
G.P. Karev
I.K. Jordan
J. Lin
J.A. Lake
J.O. Korbel
J.P. Gogarten
J.T. Herbeck
K.H. Wolfe
M. Csűrös
M. Pellegrini
M.G. Montague
M.W. Hahn
R.L. Tatusov
S. Karlin
S. Yang
S.T. Fitz-Gibbon
T. Pupko
V. Kunin
W. Feller
W.J. Reed
X. Gu
Y. Boucher
Y.I. Wolf
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/09/2005
Field of study

We introduce a Markov model for the evolution of a gene family along a phylogeny. The model includes parameters for the rates of horizontal gene transfer, gene duplication, and gene loss, in addition to branch lengths in the phylogeny. The likelihood for the changes in the size of a gene family across different organisms can be calculated in O(N+hM^2) time and O(N+M^2) space, where N is the number of organisms, h is the height of the phylogeny, and M is the sum of family sizes. We apply the model to the evolution of gene content in Proteobacteria using the gene families in the COG (Clusters of Orthologous Groups) database.

arXiv.org e-Print Archive

CiteSeerX

Crossref

A Model of Problem Solving Environment for Integrated Bioinformatics Solution on Grid by Using Condor

Author: R.L. Tatusov
S. Altschul
T. Smith
W. Pearson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004
Field of study

Abstract. To solve real-world bioinformatics problems on a grid, the integration of various analysis tools is necessary in addition to the implementation of basic tools. A workflow-based problem solving environment on a grid can be an efficient solution for this type of software development. Here we propose a model of a simple problem solving environment that enables component-based workflow design of integrated bioinformatics applications in a Grid environment by using Condor functionalities.

CiteSeerX

Crossref

A Simple Iterative Approach to Parameter Optimization

Author: Alexander Zien
Ralf Zimmer
Tatusov R.L.
Thomas Lengauer
Publication venue: 'Mary Ann Liebert Inc'
Publication date
Field of study

Crossref

Metabolism and evolution of Haemophilus influenzae deduced from a whole genome comparison with Escherichia coli

Author: Bork P.
Borodovsky M.
Brown N.P.
Hayes W.S.
Koonin E.V.
Mushegian A.R.
Rudd K.E.
Tatusov R.L.
Publication venue: 'Elsevier BV'
Publication date: 01/01/1996
Field of study

BACKGROUND: The 1.83 Megabase (Mb) sequence of the Haemophilus influenzae chromosome, the first completed genome sequence of a cellular life form, has been recently reported. Approximately 75 % of the 4.7 Mb genome sequence of Escherichia coli is also available. The life styles of the two bacteria are very different - H. influenzae is an obligate parasite that lives in human upper respiratory mucosa and can be cultivated only on rich media, whereas E. coli is a saprophyte that can grow on minimal media. A detailed comparison of the protein products encoded by these two genomes is expected to provide valuable insights into bacterial cell physiology and genome evolution. RESULTS: We describe the results of computer analysis of the amino-acid sequences of 1703 putative proteins encoded by the complete genome of H. influenzae. We detected sequence similarity to proteins in current databases for 92 % of the H. influenzae protein sequences, and at least a general functional prediction was possible for 83 %. A comparison of the H. influenzae protein sequences with those of 3010 proteins encoded by the sequenced 75 % of the E. coli genome revealed 1128 pairs of apparent orthologs, with an average of 59 % identity. In contrast to the high similarity between orthologs, the genome organization and the functional repertoire of genes in the two bacteria were remarkably different. The smaller genome size of H. influenzae is explained, to a large extent, by a reduction in the number of paralogous genes. There was no long range colinearity between the E. coli and H. influenzae gene orders, but over 70 % of the orthologous genes were found in short conserved strings, only about half of which were operons in E. coli. Superposition of the H. influenzae enzyme repertoire upon the known E. coli metabolic pathways allowed us to reconstruct similar and alternative pathways in H. influenzae and provides an explanation for the known nutritional requirements. 
CONCLUSIONS: By comparing proteins encoded by the two bacterial genomes, we have shown that extensive gene shuffling and variation in the extent of gene paralogy are major trends in bacterial evolution; this comparison has also allowed us to deduce crucial aspects of the largely uncharacterized metabolism of H. influenzae.

Elsevier - Publisher Connector

MDC Repository

Phylogenomic Study of Lipid Genes Involved in Microalgal Biofuel Production—Candidate Gene Mining and Metabolic Pathway Analyses

Author: Bondaruk M.
Eccleston V.S.
Gasteiger E.
Hall T.A.
Ikai A.J.
Tatusov R.L.
Wheelock C.E.
Publication venue: 'SAGE Publications'
Publication date
Field of study

Crossref

An Algorithm for Hierarchical Classification of Genes of Prokaryotic Genomes

Author: A. Bairoch
A. Balows
A. Kato
D.R. Boone
E. Camon
H. Su
M.D. Ermolaeva
M.N. Price
P. Rice
R.C. Prim
R.D. Finn
R.L. Tatusov
S.F. Altschul
S.S. Wilks
T.F. Smith
T.H. Cormen
U.M. Fayyad
V. Olman
X. Chen
Y. Xu
Y. Zheng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

Crossref

Operon Prediction Using Neural Network Based on Multiple Information of Log-Likelihoods

Author: B.P. Westover
D.F. Specht
H. Salgado
J.Z. Zhou
M. Pellegrini
M.D. Ermolaeva
R.L. Tatusov
X. Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

Crossref

A Fixed-Parameter Algorithm for Minimum Common String Partition with Few Duplications

Author: D.P. Lopresti
G. Shi
H. Jiang
M. Remm
P. Damaschke
P.J. Kersey
R. Overbeek
R.L. Tatusov
T. Jiang
X. Chen
Z. Fu
Publication venue
Publication date: 01/01/2013
Field of study

Abstract. Motivated by the study of genome rearrangements, the NP-hard Minimum Common String Partition problem asks, given two strings, to split both strings into an identical set of blocks. We consider an extension of this problem to unbalanced strings, so that some elements may not be covered by any block. We present an efficient fixed-parameter algorithm for the parameters number k of blocks and maximum occurrence d of a letter in either string. We then evaluate this algorithm on bacterial genomes and synthetic data.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Multi-granularity Parallel Computing in a Genome-Scale Molecular Evolution Application

Author: B. Nichols
C. Hall
J. Felsenstein
J. Walters
J.D. Thompson
J.E. Stajich
J.M. Squyres
K.D. Pruitt
L. Li
R.L. Tatusov
S. Alexandros
S.F. Altschul
Y. Lee
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009
Field of study

Crossref

A Biologist’s View of Systems Integration Systems Biology

Author: A. Bateman
A. Siepel
A.L. Delcher
F. Achard
G.O. Consortium
J.M. Rouillard
R. Apweiler
R.D. Dowell
R.L. Tatusov
S. Fischer
S. Rozen
T. Conway
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005
Field of study

Crossref